Speaker adaptation of context dependent deep neural networks based on MAP-adaptation and GMM-derived feature processing
Authors
Abstract
In this paper we propose a novel speaker adaptation method for a context-dependent deep neural network HMM (CD-DNN-HMM) acoustic model. The approach uses GMM-derived features as the input to the DNN. Processing the features this way makes it possible to apply GMM-HMM adaptation algorithms within the neural network framework. Adaptation to a new speaker is performed simply by adapting the auxiliary GMM-HMM model used to compute the GMM-derived features, and can be regarded as feature-space adaptation for the DNN system. In this work, traditional maximum a posteriori (MAP) adaptation is applied to the auxiliary GMM-HMM model. Experiments show that, in a supervised adaptation setup, the proposed technique provides on average a 5%-36% relative word error rate reduction on different adaptation sets, compared to speaker-independent (SI) CD-DNN-HMM systems. In addition, several multi-stream combination techniques are examined in order to improve the performance of the baseline SI model.
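The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes diagonal-covariance GMMs, assumes the GMM-derived feature vector is the stack of per-state log-likelihoods, assumes that only the means are adapted, and assumes the adaptation frames have already been aligned to the GMM being adapted. All function names and the relevance factor `tau` are hypothetical and chosen for illustration.

```python
import numpy as np


def log_gauss_diag(x, means, variances):
    """Log-density of every frame under every diagonal Gaussian.

    x: (T, D) frames; means, variances: (M, D). Returns (T, M).
    """
    diff = x[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)   # (T, M)
    return -0.5 * (log_norm[None, :] + mahal)


def gmm_loglik(x, weights, means, variances):
    """Per-frame log-likelihood under one GMM (log-sum-exp over components)."""
    comp = log_gauss_diag(x, means, variances) + np.log(weights)[None, :]
    peak = comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(comp - peak).sum(axis=1, keepdims=True)))[:, 0]


def gmm_derived_features(x, gmms):
    """Assumed feature definition: one log-likelihood per auxiliary state GMM.

    gmms: list of (weights, means, variances) tuples. Returns (T, num_states).
    """
    return np.stack([gmm_loglik(x, w, m, v) for (w, m, v) in gmms], axis=1)


def map_adapt_means(x, weights, means, variances, tau=10.0):
    """Standard MAP mean update with relevance factor tau (assumed value).

    new_mean_m = (tau * mean_m + sum_t gamma_tm * x_t) / (tau + sum_t gamma_tm)
    """
    comp = log_gauss_diag(x, means, variances) + np.log(weights)[None, :]
    comp = comp - comp.max(axis=1, keepdims=True)
    gamma = np.exp(comp)
    gamma = gamma / gamma.sum(axis=1, keepdims=True)  # posteriors, (T, M)
    occupancy = gamma.sum(axis=0)                     # (M,)
    first_order = gamma.T @ x                         # (M, D)
    return (tau * means + first_order) / (tau + occupancy)[:, None]
```

Under these assumptions, adapting to a speaker amounts to replacing the means of each auxiliary GMM with the output of `map_adapt_means` and recomputing `gmm_derived_features`; the DNN itself stays unchanged, which is why the abstract describes the method as adaptation in the feature space.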
Similar resources
GMM-derived features for effective unsupervised adaptation of deep neural network acoustic models
In this paper we investigate GMM-derived features recently introduced for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. We improve the previously proposed adaptation algorithm by applying the concept of speaker adaptive training (SAT) to DNNs built on GMM-derived features and by using fMLLR-adapted features for training an auxiliary GMM model. Traditional...
On the Use of Gaussian Mixture Model Framework to Improve Speaker Adaptation of Deep Neural Network Acoustic Models
In this paper we investigate the Gaussian Mixture Model (GMM) framework for adaptation of context-dependent deep neural network HMM (CD-DNN-HMM) acoustic models. In the previous work an initial attempt was introduced for efficient transfer of adaptation algorithms from the GMM framework to DNN models. In this work we present an extension, further detailed exploration and analysis of the method ...
Improved feature processing for deep neural networks
In this paper, we investigate alternative ways of processing MFCC-based features to use as the input to Deep Neural Networks (DNNs). Our baseline is a conventional feature pipeline that involves splicing the 13-dimensional front-end MFCCs across 9 frames, followed by applying LDA to reduce the dimension to 40 and then further decorrelation using MLLT. Confirming the results of other groups, we ...
On Improving Acoustic Models for TORGO Dysarthric Speech Database
Assistive technologies based on speech have been shown to improve the quality of life of people affected with dysarthria, a motor speech disorder. Multiple ways to improve Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network (DNN) based automatic speech recognition (ASR) systems for TORGO database for dysarthric speech are explored in this paper. Past attempts in develop...
Dependence of GMM adaptation on feature post-processing for speaker recognition
This paper presents a study on the relationship between feature post-processing and speaker modelling techniques for robust text-independent speaker recognition. A fully coupled target and background Gaussian mixture speaker model structure is used for hypothesis testing in this speaker model based recognition system. Two formulations of the Maximum a Posteriori (MAP) adaptation algorithm for G...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014